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Interatomic potentials have been widely used in atomistic simulations such as molecular dynam¬ 
ics. Recently, frameworks to construct accurate interatomic potentials that combine a systematic 
set of density functional theory (DFT) calculations with machine learning techniques have been pro¬ 
posed. One of these methods is to use compressed sensing to derive a sparse representation for the 
interatomic potential. This facilitates the control of the accuracy of interatomic potentials. In this 
study, we demonstrate the applicability of compressed sensing to deriving the interatomic potential 
of ten elemental metals, namely, Ag, Al, An, Ca, Cu, Ga, In, K, Li and Zn. For each elemental 
metal, the interatomic potential is obtained from DFT calculations using elastic net regression. The 
interatomic potentials are found to have prediction errors of less than 3.5 meV/atom, 0.03 eV/A 
and 0.15 GPa for the energy, force and the stress tensor, respectively, which enable the accurate 
prediction of physical properties such as lattice constants and the phonon dispersion relationship. 

PACS numbers: 71.15.Pd,31.50.Be,34.20.-b,65.40.-b 


I. INTRODUCTION 


Molecular dynamics (MD) has been a popular tool for 
modeling a collection of interacting atoms within classi¬ 
cal mechanics [l|. The relationship between energy and 
atomic coordinates, namely, the potential energy surface 
(PES), plays a key role in MD simulations since the PES 
determines the forces acting on atoms that originate from 
atomic interactions and therefore the motion of atoms. 
As an alternative to first principles MD calculations, 
which provide the most accurate PES0, frameworks to 
estimate a reliable PES based on the combination of 
systematic density functional theory (DFT) calculations 
and machine learning techniques have recently been pro¬ 
posed, which are applicable to periodic systems[3,l3. The 
starting point of these methods is that a DFT calculation 
is performed for at least 10^ different atomic configura¬ 
tions. A PES is then constructed from the DFT training 
data set using regression techniques to estimate the rela¬ 
tionship between predictor and observation variables. Its 
accuracy is known to be much better than that of con¬ 
ventional interatomic potentials owing to the flexibility 
of the method. The flexibility also makes it possible to 
construct the PES for a wide range of materials using the 
same method. 

To estimate the PES from a data set of the energy 
for many atomic configurations, a variety of methods 
can be applied. For ^plications to molecules and clus¬ 
ters, spline method^, Q, interpolating moving least- 
squares methods [3, [J modified Shepard interpolation 
and other in terp olation techniques [91-1 1 1| . artificial neu¬ 
ral networks 112-1^ a nd the reproducing kernel Hilbert 
space method^U" 311 have been used. However, only a 
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few nonlinear regression techniques such as artificial neu¬ 
ral networks [32 - l33 | and Gaussian process regression^ 
have been adopted to estimate the PES for solids on ac¬ 
count of the complex relationship between the energy and 
crystal structure in solids. Thus, applications to solids 
have been limited to a small number of metallic [13, [13, 
covalent [3,0, nil and ionic materials [Hi ■ 

In these methods, the PES is generally estimated 
by transforming atomic positions into some descriptors. 
This plays an essential role in constructing a PES that 
satisfies several invariances, such as translational and 
rotational invariance, and is transferable to structures 
composed of a different number of atoms from those in 
the training data. Obviously, the accuracy of the PES 
strongly depends on the selection of the descriptors. So 
far, several descriptors for expressing atomic coordinates 
have been proposed [H - IHjl . although only some of them 
have succeeded in obtaining accurate PESs. Descriptors 
are preferably chosen without a priori knowledge of the 
energetics of the target material. Therefore, it is desir¬ 
able to establish a method that enables automatic opti¬ 
mization of the descriptors for constructing a PES. 

The use of least absolute shrinkage and selection op¬ 
erator (LASSO) regression[43 is a promising means 
of enabling the automatic selection of descriptors. We 
previously introduced a simple scheme to estimate the 
PES using LASSO regression that was based on a set 
of simple and systematic basis functions and a linear re¬ 
lationship between the energy and basis functions [Hj. 
which was applied to the elemental metals of Na and 
Mg. It was found that a sparse representation for the 
PES with a small number of basis functions was effi¬ 
ciently derived from relatively a large number of sys¬ 
tematic candidate basis functions depending only on the 
distances between atoms. We also found that the en¬ 
ergy can be expressed by a linear relationship with the 
basis functions. As a result, sparse PESs with predic- 
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tion errors of 1.3 and 0.9 meV/atom were obtained for 
Na and Mg, respectively. In addition to our applica¬ 
tion of LASSO regression for PES construction, it has 
recently been used to obtain sparse representations for al¬ 
loy thermodynamics [13, interatomic force constants 
and the PES for a molecule [i^. 

The scheme to construct the PES on the basis of 
LASSO regression with a linear relationship for energy 
has a number of advantages: 1) Accuracy can be con¬ 
trolled in a transparent manner. 2) A well-optimized 
sparse representation for the PES is obtained, which can 
accelerate and increase the accuracy of atomistic simula¬ 
tions while decreasing the computational costs. 3) Infor¬ 
mation on the forces acting on atoms and stress tensors 
can be included in the training data in a straightforward 
manner. 4) Regression coefficients are generally deter¬ 
mined quickly using a standard least-squares technique. 
5) The number of regression coefficients does not explic¬ 
itly depend on the size of the input data set. 

In this study, we apply this scheme to construct PESs 
for ten elemental metals, namely, Ag, Au, Al, Ca, Cu, 
Li, K, Ga, In and Zn. Here, we use elastic net regression, 
which is a generalization of LASSO regression. This pa¬ 
per is organized as follows. Sec. [H] presents the method¬ 
ology including the linear expression for the total energy, 
the systematic basis functions and the regression tech¬ 
niques. Linear expressions for the forces acting on atoms 
and the stress tensors are also derived from the expres¬ 
sion for the total energy. In Sec. IIIIl the procedure for 
optimizing the input factors used to estimate the PES 
is described. In Sec. IlYl the application of elastic net 
regression to the ten elemental metals is demonstrated. 
Finally, we give a conclusion in Sec. |Vl 


II. METHODOLOGY 

A. Linear expressions for total energy, forces 
acting on atoms and stress tensor 

To model the relationship between the total energy and 
the crystal structure, we adopt the linear expression for 
the total energy proposed in Ref. IH. Figure |T] shows the 
linear model for the total energy, which is based on the 
widely accepted idea that the total energy of a structure 
is equal to the sum of its atomic energie^ 0] . Introduc¬ 
ing a basis expansion derived from a set of other atomic 
positions, this model assumes a linear relationship be¬ 
tween the energy of atom j in structure i, and the 

set of basis functions for atom j in structure i, 

The linear relationship between the atomic energy and 
M given basis functions is expressed as 

M 

( 1 ) 

n—0 



FIG. 1. Relationship between total energy and crystal struc¬ 
ture. 


U j) 

6g = I. By applying the same expansion coefficients 
to identical atomic species, the total energy of structure 
i composed of atoms is derived as 


aW = 


i,j) 


= E^ 


3 


J) 




where Xn'^ satishes 
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( 3 ) 


Consequently, the total energy of structure i is expressed 
as a linear relationship with the sum of the basis functions 
for all atoms in structure i. 

The forces acting on atoms and the stress tensor can be 
given by linear equations as well as the total energy (Ap¬ 
pendix]^ for details), ath component of the force acting 
on atom I and the virial stress tensor <Jap of structure i 
are expressed as 


and 


= Vic 


= 


^n^stress.n’ 


( 4 ) 

( 5 ) 


respectively, where and a;stress li can be derived 

from the derivative of the basis functions with respect to 
the atomic coordinates as will be shown later. 


B. Estimation of regression coefficients 


where and denote the expansion coefficients and The expansion coefficients w = [wq, wi, • • • , 
basis functions for atom j of structure i, respectively, and characterizing the energetics of a system can be estimated 
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by regression, which is a machine learning method for es¬ 
timating the relationship between the predictor and ob¬ 
servation variables using a training data set. Regarding 
the training data, the energy, the forces acting on atoms 
and the stress tensor computed by DFT calculation can 
all be used as observations in the regression process, since 
all of them can be expressed by linear equations with 
the same expansion coefficients. When considering only 
the energy as observations, the predictor matrix X and 
observation vector y correspond to Xenergy and i/energyi 
which are composed of Xn and the DFT energies for the 
structures in the training data, respectively, that is, 


X = X 


energy, 


y — 2/energy- 


( 6 ) 


When using the energy, forces and stress tensor as obser¬ 
vations, X and y are written as 


-^energy 


2/energy 

-Xforce 

, V = 

2/force 

-^stress 


2/stress 


where Xforce and X^tress are composed of and 

2^stress ri f^e sti'uctures in the training data, respec¬ 
tively. j/force and ystress are composed of the forces and 
stress tensor computed by the DFT calculation for the 
structures in the training data, respectively. 

A simple procedure to estimate the expansion coeffi¬ 
cients is to use linear ridge regression. This is a shrinkage 
method and shrinks the regression coefficients by impos¬ 
ing a penalty. The ridge coefficients minimize the penal¬ 
ized residual sum of squares expressed as 

L{w) = \\Xw - y\\l + X\\w\\l, (8) 


where A controls the magnitude of the penalty. This 
is referred to as L 2 regularization. The solution is eas¬ 
ily obtained only in terms of matrix operations as in = 
(X^X -I- XI)~^X^y, where I denotes the unit matrix. 
Therefore, the regression coefficients can be easily es¬ 
timated while avoiding the well-known multicollinearity 
problem occurring in the ordinary least-squares method. 

Although linear ridge regression is useful for obtaining 
a PES from a given basis set, a basis set appropriate for 
the system of interest is generally unknown. Moreover, a 
PES with a small number of basis functions is desirable 
to decrease the computational cost in atomistic simula¬ 
tions. Therefore, we use elastic net regressionji^ in 
combination with the preparation of a considerable num¬ 
ber of basis functions. Elastic net regression is a general¬ 
ization of LASSO regression [13, and combines the Li 
and L 2 penalties. Elastic net regression enables us not 
only to provide a solution for linear regression but also 
to obtain a sparse representation with a small number of 
nonzero regression coefficients. The solution is obtained 
by minimizing the function 

L{w) = ||Xin -y||^ -kaAlliulli -k (9) 


where the parameter a determines the mixing of the 
penalties. When a = 1, the minimization function corre¬ 
sponds to that of the LASSO. The accuracy of the solu¬ 
tion can be controlled simply by adjusting the values of 
A and a for a given training data set. 

The use of elastic net regression allows us to avoid 
several limitations of the LASSO. For example, if there 
is a group of highly correlated predictor variables, the 
LASSO tends to select only one variable from the group. 
Also, in the case of high-dimensional predictor variables 
with few observations, the LASSO selects at most the 
same number of predictor variables as the number of ob¬ 
servations before the solution saturates. 

Note that the units for the energy, forces and stress ten¬ 
sor are different, hence care is required in the selection 
of the units. The units act as weights in the regression. 
Here, we used the units of eV/supercell, eV/A and GPa 
for the energy, forces and stress tensor, respectively, when 
considering all of them as observations. When consider¬ 
ing only the energy as observations, the unit of eV/atom 
can also be used. In this study, regression coefficients 
were estimated using the standardized training data. 


C. Basis functions 


In this study, the following simple form of the basis 
functions is newly used as the linear expression for the 
energy. The pth power of the nth element for atom / of 
structure i, is written as 


n,p 


_ k 


( 10 ) 


where p is a positive integer and denotes the distance 
between atoms j and k of structure i. The sum is taken 
over all atoms within a cutoff radius Rc from atom j. 
fn{R^jl) and fc{R!fl) are an analytical pairwise function 
and a smooth pairwise cutoff function that is zero at a 
distance greater than respectively. Since the prod¬ 
uct of fn and fc is pairwise, an exponential form of the 
sum of the pairwise functions is introduced to take many- 
body effects into account. Although pairwise functions 
are adopted here for /„, other types of basis functions 
such as angular basis functions can be used in principle. 

Taking the sum of the basis functions for all atoms, 
Xn}p is obtained as 


.(* 

ii.; 


E 


_ k 


( 11 ) 


Using a combination of the linear model and this form 
of the basis functions, the expression for the total energy 
is invariant to the translation, rotation and exchange of 
atoms. In addition, it can be used to input crystal struc¬ 
tures with a different number of atoms from the struc¬ 
tures in the training data. 
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Si,l,a) 

‘^force,n,p 


and X 


(i,a,P) 
stress,n,p 


can also be derived as 


‘^force,n,p 


P2^^n,p-1 li) 
J "^l,a 


and 


(i,a,P) ^ _P_\^ f>ii) dhj) 

■^stress,n,p Z_^ ^n,p-l „^(i) 

I J 


( 12 ) 


(13) 


respectively (Appendix for details). The derivative of 
the basis functions with respect to the ath component of 
the atomic position is written as 


^ = E lfniRjk)mjk) + fn{Rjk)mjk)] 

(14) 

where the structure index i is omitted. 

For the pairwise analytical function /„, we intro¬ 
duce Gaussian, cosine, Bessel, Neumann, modified Mor- 
let wavelet (MMW), Slater-type orbital (STO) and 
Gaussian-type orbital (GTO) functions. Table U shows 
the different function forms of /„ and their derivatives 
with respect to the distance used in this study. The 
derivatives can be seen in the expressions for the forces 
acting on atoms and the stress tensor. For the cosine 
and MMW types, function forms with a single internal 
parameter are introduced, while for the Gaussian, STO 
and GTO types, function forms with two internal pa¬ 
rameters are used. Using a number of functions with 
different internal parameters for each type of function, a 
systematic set of basis functions used to select the basis 
in elastic net regression is obtained. For the cutoff func¬ 
tion, we adopt the cosine-based cutoff function used in 
Ref. [^, expressed as 


MR) 



{R < Rd 

{R > Rc) 


(15) 


III. OPTIMIZATION OF INPUT FACTORS 

The accuracy of the elastic net PES mainly depends on 
1) the cutoff radius Rc, 2) the size of the training data, 

3) the variety of structures included in the training data, 

4) the observation properties used for regression, 5) the 
candidate basis functions and 6) the parameters a and 
A in the minimization function of the elastic net regres¬ 
sion. However, it is difficult to optimize all of these input 
factors simultaneously. We therefore optimize them in a 
stepwise manner. 


A. DFT data set 

To begin with, training and test data sets were gener¬ 
ated from systematic DFT calculations. The test data set 


was used to examine the predictive power for structures 
that were not included in the training data set. We gener¬ 
ated 2700 and 300 configurations for the training and test 
data sets, respectively, for each elemental metal. They 
include structures made by isotropic expansions, random 
expansions and random distortions of ideal face-centered- 
cubic (fee), body-centered-cubic (bcc), hexagonal-closed- 
packed (hep), simple cubic (sc), oj and ,5-tin structures, 
in which the atomic positions and lattice constants were 
fully optimized. Random structures were generated using 
Gaussian random numbers with ten different values for 
the variance. These configurations were made using su¬ 
percells constructed by the 2x2x2, 3x3x3, 3x3x3, 
4x4x4, 3x3x3 and 2x2x2 expansions of the 
conventional unit cells for fee, bcc, hep, sc, w and /3-tin 
structures, which are composed of 32, 54, 54, 64, 81 and 
32 atoms, respectively. 

For a total of 3000 configurations for each ele¬ 
mental metal, DFT calculations were performed using 
the plane-wave basis projector augmented wave (PAW) 
method[5l|, within the Perdew-Burke-Ernzerhof 

exchange-correlation functionalas implemented in 
the VASP code [13, The cutoff energy was set to 

400 eV. The total energies converged to less than 10“^ 
meV/supercell. For only the ideal structures, the atomic 
positions and lattice constants were optimized until the 
residual forces became less than 10 “^ eV /A. 


B. Cutoff radius 

The optimal cutoff radius was determined for each el¬ 
emental metal using linear ridge regression with a given 
set of 180 basis functions consisting of 18 Bessel, 18 Neu¬ 
mann, 60 cosine and 84 Gaussian functions. We searched 
for a cutoff radius giving a low prediction error by con¬ 
structing PESs using several cutoff radii ranging from 5 
to 10 A with intervals of 0.5 A. The ridge penalty term 
was set to A = 10“"^. To estimate the prediction error 
of each PES, we calculated the root-mean-square error 
(RMSE) between the observation property predicted by 
the DET calculation and that predicted by the PES for 
the test data. Then, the convergence of the RMSE was 
examined. Here, the energy, the force and the stress ten¬ 
sor were used as the observation properties. In addition, 
the cutoff radius also plays an essential role in expressing 
the energy of structures with large volumes. However, 
the contribution of such structures to the RMSE was mi¬ 
nor in our data sets. Therefore, we determined the cutoff 
radii using the convergence of the energy-volume curve 
in addition to the convergence of the RMSE. 

Table U shows the optimized cutoff radii. Hereafter, 
these optimized values will be used. Table m also shows 
the RMSEs of the energy, force and stress tensor. We 
obtained PESs with the RMSEs for the energy ranging 
from 0.5 to 4.0 meV/atom. In other words, PESs with 
high accuracy were obtained even using a given set of 
basis functions and linear ridge regression. The given set 
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TABLE I. Adopted analytical pairwise functions for fn{R) and their derivatives f^{R). The internal parameters a and b are 
always positive except for parameter a in the STO and GTO functions. For the Bessel and Neumann functions, n is a positive 
integer. 



UR) 

UR) 

Bessel 

MR) 

-fi{R) (n = 0), [/n-i(R) - fn+i] /2 (n > 1) 

Neumann 

W(R) 

-fi{R) (n = 0), [/n-i(R) - fn+i] /2 (n > 1) 

Cosine 

cos(ai?) 

—asin(ai?) 

Modified Morlet wavelet (MMW) 

cos (ait)/cosh (7?) 

—asin(ai?)/cosh(R) — f„{R) tanh{R) 

Gaussian 

exp [—a(R — 6)^] 

-2a{R - b)fn[R) 

Slater-type orbital (STO) 

R“exp(-6R) 

[a - bR)R‘^-^ exp i-bR) 

Gaussian-type orbital (GTO) 

exp(-6R2) 

{a - 2bR^)R°-^ exp {-bR^) 


TABLE 11. Optimal cutoff radius Rc- PESs are constructed 
by linear ridge regression using a given basis set composed 
of 180 basis functions including 18 Bessel, 18 Neumann, 60 
cosine and 84 Gaussian functions. 

Linear ridge regression (A'basis = 180) 

Rc RMSE (energy) RMSE (force) RMSE (stress) 
(A) (meV/atom) (eV/A) (GPa) 


Ag 

7.5 

2.4 

0.012 

0.08 

A1 

8.0 

4.0 

0.020 

0.15 

Au 

6.0 

3.0 

0.030 

0.16 

Ca 

9.5 

1.2 

0.011 

0.03 

Cu 

7.5 

2.6 

0.018 

0.10 

Ga 

10.0 

1.9 

0.019 

0.11 

In 

10.0 

1.9 

0.017 

0.07 

K 

10.0 

0.5 

0.001 

0.00 

Li 

8.5 

0.5 

0.005 

0.02 

Zn 

10.0 

3.0 

0.021 

0.22 


of basis functions may be acceptable for expressing the 
energetics of the ten different elemental metals. However, 
this may not be the case with other systems. 


C. Observation property 


TABLE III. Optimized candidate basis set used in elastic 
net regression. Min. and Max. stand for the minimum and 
maximum values of the arithmetic sequence of the internal 
parameter, respectively. Aseq denotes the number of compo¬ 
nents of the sequence. 


Basis type 

Number of 


Internal parameter 

basis functions 


Min. 

Max. 

Naeq 

Bessel 

18 

n 

0 

5 

6 

Neumann 

18 

n 

0 

5 

6 

Cosine 

300 

a 

0.1 

10.0 

100 

MMW 

300 

a 

0.1 

10.0 

100 

Gaussian 


a 

0.1 

2.0 

20 

1200 



b 

0.0 

5.0 

20 

STO 


a 

-2 

2 

5 

1500 






b 

0.1 

10.0 

100 

GTO 


a 

-2 

2 

5 

1500 






b 

0.1 

10.0 

100 

Total 

4836 






force and the stress tensor, which is essential for calcula¬ 
tions of phonon dispersions and structure optimization, 
we hereafter use the energy, force and stress tensor as the 
observation properties unless otherwise specified. 


The dependence of the PES accuracy on the obser¬ 
vation property of the training data was examined. We 
compared two training sets of observation properties used 
for regression. One training set was composed only of the 
energy, and the other set is composed of the energy, force 
and stress tensor. The comparison was carried out using 
the above 180 basis functions, the optimized cutoff radius 
and linear ridge regression. When using only the energy 
as the observation property, the RMSE was small for the 
energy but large for the force and the stress tensor. In¬ 
cluding the force and the stress tensor to the observation 
properties resulted in improved prediction for the force 
and the stress tensor at the expense of the predictive 
power for the energy. To ensure accuracy for both the 


D. Candidate basis set 

Even using a combination of linear ridge regression 
and the above 180 basis functions, PESs with high ac¬ 
curacy were sometimes obtained. However, a PES with 
a smaller number of basis functions is generally prefer¬ 
able to accelerate the computation of the energy, forces 
and stress tensors. Additionally, it may be possible to 
find more suitable basis functions by considering other 
basis functions. Therefore, it is useful to select suitable 
basis functions from a candidate basis set including other 
basis functions by elastic net regression. 

In general, a candidate basis set should ideally be com¬ 
pact for the following reasons. 1) If the elastic net uses 
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FIG. 2. Dependence of RMSE of linear ridge PES on the number of basis functions for (a) cosine and (b) MMW types. Max. 
stands for the maximum value of the arithmetic sequence of the internal parameter. 


too many basis functions compared to the number of 
input observations, the selection of a good set of basis 
functions tends to be difficult. 2) The amount of avail¬ 
able memory on computers can be exhausted, particu¬ 
larly when the forces and stress tensor are used as the 
observations. 

A compact candidate basis set was obtained by opti¬ 
mization on the basis of the results of linear ridge regres¬ 
sion with a single type of basis function. For a basis type 
with a single internal parameter, a trial set for the inter¬ 
nal parameter was given by an arithmetic sequence. The 
sequence can be specified by the minimum and maximum 
values and the number of components of the sequence. 
For the Bessel and Neumann types, we set the minimum 
and interval of the arithmetic sequence to zero and one, 
respectively. For the cosine and MMW functions, the 
minimum value was taken to be 0.1. Linear ridge PESs 
were then constructed using many trial sets for each ba¬ 
sis type. Here, all the basis functions with p = 1,2,3 
for each /„ were considered because it has been shown 
that the use of p = 1,2,3 terms greatly decreases the 
prediction error for elemental Na and Mgjd^. The ridge 
penalty term was taken to be A = 10“^. 

Figures [5] (a) and (b) show the convergence of the 
RMSE for the energy with respect to the number of ba¬ 
sis functions for the cosine and MMW types. Unfixed 
parameters of the sequence were determined from the 
convergence of the RMSE. Table Hill shows the optimized 
number of basis functions together with the minimum 


and maximum values of the sequence. We also tested 
polynomial forms, three types of exponential forms and 
Mexican hat wavelets with a single internal parameter as 
candidates of /„. However, they showed a larger RMSE 
or unstable behavior when performing regression. 

For each basis type with two internal parameters, two 
arithmetic sequences were given and all combinations 
of their components were considered. For the Gaussian 
type, the minimum values of the sequences for a and b 
were fixed to 0.1 and 0.0, respectively. The number of 
components of each sequence was set to 20. Therefore, 
each sequence was specified only by the maximum value. 
For the STO and GTO types, the sequence for a was 
fixed so that the minimum value, the interval and the 
maximum value were given as —2, 1 and 2, respectively. 
The minimum value of the sequence for b was set to 0.1. 
The sequence for b was specified by the maximum value 
and the number of components of the sequence. Table Hill 
shows the optimized sequences. We used all the types of 
basis functions shown in Table m as the candidate basis 
set. The total number of functions in the candidate basis 
set was 4836. 


IV. ELASTIC NET POTENTIAL ENERGY 
SURFACE 

PESs were then estimated by elastic net regression us¬ 
ing the DFT observations, the candidate basis set and 
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FIG. 3. Dependence of RMSEs for the energy and the stress tensor of elastic net PES (a = 1, LASSO) on the number of the 
basis functions for ten elemental metals. RMSEs for the energy and the stress tensor are shown by orange open circles and 
blue open squares, respectively. 


TABLE IV. RMSEs for the energy, the force and the stress tensor of elastic net PESs showing the minimum criterion score. 
Equilibrium lattice constants for the bcc and fee structures estimated from the elastic net PES are also shown. Values in 
brackets were obtained directly by DFT calculation. 


Element 

Number of 

basis functions 

RMSE (energy) 
(meV / atom) 

RMSE (force) 
(eV/A) 

RMSE (stress) 
(GPa) 

a (bcc) 

(A) 

a (fee) 

(A) 

Ag 

190 

2.2 

0.011 

0.07 

3.309 (3.311) 

4.157 (4.160) 

A1 

210 

3.5 

0.020 

0.12 

3.234 (3.233) 

4.039 (4.038) 

Au 

165 

2.4 

0.030 

0.15 

3.316 (3.309) 

4.172 (4.164) 

Ca 

234 

1.2 

0.010 

0.03 

4.383 (4.381) 

5.522 (5.519) 

Cu 

202 

2.6 

0.018 

0.12 

2.885 (2.887) 

3.630 (3.633) 

Ga 

266 

2.2 

0.017 

0.09 

3.371 (3.371) 

4.227 (4.228) 

In 

253 

2.3 

0.019 

0.07 

3.814 (3.815) 

4.797 (4.797) 

K 

197 

0.3 

0.001 

0.00 

5.284 (5.283) 

6.666 (6.662) 

Li 

222 

0.4 

0.005 

0.02 

3.440 (3.439) 

4.329 (4.331) 

Zn 

288 

2.9 

0.016 

0.15 

3.130 (3.136) 

3.928 (3.935) 


the optimized cutoff radius. PESs were optimized by 
changing the parameters a and A in the minimization 
function of Eqn. We varied A from 10^ to 10 ^ 

and adopted values of a of 0.6, 0.8 and 1. In elastic 
net regression, we used only the energy and stress tensor 
as the observations because the available computational 
memory was limited. Although the regression coefficients 
obtained by the elastic net were valid as they were, we 
reestimated them using linear ridge regression, where the 
energy, forces, and stress tensor were used as the obser¬ 
vations. 

As a criterion score to determine the optimal PES, the 
average of the RMSEs for the energy and the stress tensor 
was used. Here, we regarded the PES with the lowest cri¬ 
terion score as the optimal one. Note, however, that the 


definition of the optimal PES depends on the purpose. In 
another situation, a PES with a smaller number of ba¬ 
sis functions may be regarded as the optimal one when 
a decrease in the computational costs at the expense of 
slight degradation of the accuracy is desired. 

Figure[3]shows the dependence of the RMSE for the en¬ 
ergy and stress tensor on the number of basis functions 
when a = 1. The number of selected basis functions 
tended to increase with decreasing A. At the same time, 
the RMSE for the energy and stress tensor tended to de¬ 
crease. Although multiple PESs with the same number 
of basis functions were sometimes obtained from different 
values of A, only the PES with the lowest criterion score 
among the PESs with the same number of basis functions 
is shown in Fig. [3] On the other hand, the criterion score 
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FIG. 4. Comparison of the energies predicted by the elastic 
net PES and DFT for (a) A1 and (b) Zn, measured from the 
energy of the most stable structure among the bcc, fee, hep, 
sc, uj and /3-tin structures. 


does not change significantly with decreasing a although 
the number of selected basis functions increases. There¬ 
fore, we will hereafter show only the results for a = 1. 

Table IV] shows the RMSEs for the energy, the force 
and the stress tensor of the optimal elastic net PES. We 
obtained PESs with the RMSE for the energy in the range 
of 0.3—3.5 meV/atom for the ten elemental metals using 
only 165—288 basis functions. The RMSEs for the force 
and the stress are within 0.03 eV/A and 0.15 GPa, re¬ 
spectively. Compared with the RMSEs of the PESs con¬ 
structed from the given basis set shown in Table im the 
prediction errors are reduced for some elemental metals 
as a consequence of the automatic optimization of the 
basis set. Table US also shows the equilibrium lattice 
constants of the bcc and fee structures for the ten ele¬ 
mental metals predicted by the elastic net PES, together 
with those predicted by DFT. The PESs have equilib¬ 
rium lattice constants that are in agreement with those 
obtained by DFT. 

Figure|4]shows a comparison of the energies of test data 
predicted by the elastic net PES and DFT for A1 and Zn, 
showing the largest and second largest RMSEs for the 
energy. As can be seen in Fig. 01 there is little difference 
between the DFT and elastic net PES energies regardless 
of the crystal structure. In addition, no dependence of 
the RMSE on the energy can be clearly observed despite 
the wide range of structures included in both the training 
and test data. 

The applicability of the elastic net PES to the calcula¬ 
tion of the force was also examined by comparing phonon 
dispersion relationships computed by the elastic net PES 
and DFT. The phonon dispersion relationships were cal¬ 
culated by the supercell approachjs^ for the bcc and fee 
structures with the equilibrium lattice constant. To eval¬ 
uate the dynamical matrix, each symmetrically indepen¬ 
dent atomic position was displaced by 0.01 A. The forces 
acting on atoms by the elastic net PES can then be ana¬ 
lytically computed using Eqn. O- Supercells were made 
by 4 X 4 X 4 expansion of the conventional unit cells for 
both the bcc and fee structures. The phonon calculations 
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FIG. 5. Phonon dispersion relationships for ten elemental 
metals with (a) fee and (b) bee struetures. Phonon dispersion 
eurves obtained by the elastie net PES and DFT are shown 
by blue solid and orange broken lines, respeetively. Negative 
values indicate imaginary modes. 


were performed using the phonopy codejs^. Figure [5] 
shows the phonon dispersion relationships of the (a) bcc 
and (b) fee structures for the ten elemental metals, com¬ 
puted by both the elastic net PES and DFT. For all ele¬ 
mental metals with both the bcc and fee structures, the 
phonon dispersion relationships calculated by the elastic 
net PES are in good agreement with those calculated by 
DFT. This demonstrates that the elastic net PES is suf¬ 
ficiently accurate to perform atomistic simulations with 
similar accuracy to DFT calculation. 


V. CONCLUSION 

We have applied a method of constructing a linearized 
PES by elastic net regression to a wide range of elemental 
metals. Compared with the other approach based on sys¬ 
tematic first-principles calculations, the elastic net inter¬ 
atomic potential has the following main advantages. 1) 
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A well-optimized sparse representation for the PES can 
be obtained, which increases the accuracy of atomistic 
simulations while decreasing the computational cost. 2) 
The accuracy can be easily controlled, i.e., the trade-off 
between the accuracy and computational cost is deter¬ 
mined by a small number of parameters. 3) Information 
on the forces acting on atoms and stress tensors can be in¬ 
cluded in the training data in a straightforward manner. 
This ensures the reliability of the force and stress tensor 
calculation using constructed interatomic potentials. 

As a result of applying the present method, we found 
that the energetics can be expressed by a linear rela¬ 
tionship with simple basis functions depending only on 
distances between atoms. A sparse set of suitable ba¬ 
sis functions for expressing the PES can also be easily 
extracted from 4836 basis functions by elastic net regres¬ 
sion. As a result, we have obtained a sparse PES with 
prediction errors ranging from 0.3 to 3.5 meV/atom. The 
prediction errors for the force and the stress tensor were 
within 0.03 eV/A and 0.15 GPa, respectively. Also, we 
compared equilibrium lattice constants and phonon dis¬ 
persion relationships obtained by the elastic net PES and 
by DFT calculation. The former were in good agreement 
with the latter for all ten elemental metals considering in 
this study. 
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i is expressed by 


where 


^l,a — 


dR 


(0 

l,OL 


dx 


(0 
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n,p 
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The stress tensor is generally obtained by virial stress 
computation. The virial stress tensor (Jap is expressed as 


AO 

fo/3 




(0 

Lol^LB ^ 


(A3) 


where V denotes the volume of the cell containing 
atoms. Using the expression for the forces in Eqn. 
the stress tensor is derived as the following linear equa¬ 
tion: 
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Appendix A: Expressions for forces acting on atoms 
and stress tensor 

Here, expressions for forces acting on atoms and the 
stress tensor are derived from the derivative of Eqn. @ 
with respect to the atomic positions provided in Garte- 
sian coordinates. Although they depend on the form of 
the basis functions, expressions for a linear model involv¬ 
ing only basis functions depending only on pair distances 
are derived. Since the total energy of structure i has a 
linear relationship with the sum of the basis functions, 
ath component of the force acting on atom I of structure 

I 




_ _ 

V 


J ^^1,0 


n,p 


= E' 

n,p 

= E' 


^n,p 




(i) Aid) 






n^p 


where 


(A4) 
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(i,a,/3) ^ _P_\^ n(i) 

^stress,n,p T/ ' 

I J 9R\j 


(A5) 


By computing all the contributions from atoms within 
the cutoff radius using Eqn. 0, the virial stress is 
obtained. 

The derivative of the basis functions with respect to 
the ath component of the atomic position is written as 


db. 


U) 

n,l 


dRi,c 


= E 


dfnjRjk) 

dRl,a 


fc{Rjk) + fn{Rjk) 


dfc{Rjk) 


dRi,o 


= E ifniRjk)fc{Rjk) + fu{Rjk)mjk)] 


dRjk 
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7,0! 


(A6) 
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where structure index i is omitted. Three types of deriva¬ 
tives can be found in Eqn. (IA6I) . The derivative of the 
distance with respect to the ath component of the atomic 
position is expressed as 


where Rj denotes the three-dimensional atomic position 
of atom j in Cartesian coordinates. The derivative of the 
cutoff function with respect to the distance is given by 

fciRjk) = sin . (A8) 


dRjk 

dRl,a 


Rjk 

_ 


{l=j) 

1 

{l = k) 


The derivative of the pairwise function /„ with respect 
to the distance depends on the selection of the functions, 
hence the expression for the derivative for each type of 
(A7) fn is shown in Table HI 
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